Abstract: We address the problem of optimizing the throughput
of network coded traffic in mobile networks operating in
challenging environments where connectivity is intermittent and 
locally available memory space is limited. Random linear network 
coding (RLNC) is shown to be equivalent (across all possible 
initial conditions) to a random message selection strategy where 
nodes are able to exchange buffer occupancy information during 
contacts. This result creates the premises for a tractable analysis 
of RLNC packet spread, which is in turn used for enhancing its 
throughput under broadcast. By exploiting the similarity between 
channel coding and RLNC in intermittently connected networks, 
we show that quite surprisingly, network coding, when not used 
properly, is still significantly underutilizing network resources. 
We propose an enhanced forwarding protocol that increases 
considerably the throughput for practical cases, with negligible 
additional delay. 

I. Introduction 

The paper focuses on improving throughput in intermittently 
connected networks while maintaining low delivery delays. 
Intermittently connected networks (or DTNs - disruption 
tolerant networks) are networks of very mobile, power- and 
memory-constrained devices where connectivity is sporadic. 
This is the model of choice for wireless networks operating 
in challenging conditions (networks of UAVs, disaster relief 
scenarios, etc.). As traditional routing approaches cannot be 
applied in this case (little is known in advance about future 
connectivity), the literature has developed opportunistic (epi- 
demic) forwarding protocols that replicate packets to multiple 
relay nodes in order to optimize the delivery delay and/or 
the chance that packets get delivered to the destination(s) [1].
Increasing throughput, while keeping low delay, is a problem 
of practical interest as it would enable nodes to receive 
more information per time unit, with almost the same delay. 
RLNC has emerged recently as a promising approach for such 
applications. It ameliorates the transmission by introducing the 
diversity of multiple independent combinations in the epidemic 
forwarding. Nevertheless, analyzing and optimizing RLNC 
for DTNs is difficult [2]. We prove that RLNC in DTNs is
in fact equivalent to a forwarding algorithm not employing 
network coding, which is much easier to analyze. Using this 
equivalence, we show that RLNC still produces too many 
redundant packets during contacts, thereby underutilizing net- 
work resources (inter-contact times and consequently buffer 
space). Our study shows that transmissions of a backlogged 
source can be conveniently pipelined even though no feedback 
from destinations is available such that significant throughput 
gains can be attained with negligible additional delay. Based 
on these observations, we design and evaluate a forwarding 



protocol with reduced buffer and energy requirements (less
mobility required for collecting the same number of packets).

Related work: In their seminal paper, Deb et al. [3] offer an
in-depth analysis of random linear network coding (RLNC)
for networks with intermittent contacts. Numerous studies
have built upon these results, extending them to the case
of DTNs [4], [5], where RLNC is used to improve the average
delivery ratio within a given time unit. Our work is motivated
by the observation that these studies extend the conclusions of
[3] past the assumptions under which those results were
obtained, such that they no longer hold. In particular,
all protocols for optimizing throughput or delays in DTNs 
break the initial condition assumption (messages do not have 
equal initial spread), which leads to significant throughput loss. 
Both Lin et al. [6] and Altman et al. [7] (which studies network
coding and Reed-Solomon codes in two-hop DTNs) note the
similarity between channel coding and data transmission in 
DTNs. We study the implications of this analogy on the 
aforementioned initial condition assumption. 

II. Network Models 

The network model is similar to the one used in The 
network consists of N mobile nodes with the same radio range 
and buffer space B. We consider that a wireless link (contact)
is established between two nodes when they are in each other's
radio range. All contacts are bidirectional. Their duration is
considered to be negligible with respect to the inter-contact
times, but sufficient to allow the transmission of one
packet in each direction. We consider mainly the case of a
backlogged source broadcasting data to the entire network and 
then extend the conclusions to multi- and unicast. The source 
aims at maximizing the average throughput at destinations. 
We consider a mobility model with exponential inter-contact
times of parameter λ, which has been validated for a wide
spectrum of mobility scenarios [8]. Our analysis is however
not constrained to this type of mobility.

The backlogged source is considered to have at least v ∈ ℕ
packets in its buffer. These packets will be called hereafter 
variables and represent original (not coded) source-generated 
packets. Out of them, the source selects every time a set of 
v oldest packets to be transmitted in the network. When the 
transmission of the v packets is considered completed (after a 
fixed time, TTL, set as a function of N), they are deleted by 
all nodes, the source selects the next v packets and repeats the 
operation. Minimizing the time to delivery and the probability 
that not all nodes have decoded the content are desirable. 



All nodes implement random linear network coding over a
finite field GF(2^k). The source is assumed to send to nodes
that it encounters coded packets (one such packet per contact).
Coded packets are elements in the set of v independent linear
combinations of v variables (a set called the packet batch), where
coefficients are randomly selected from GF(2^k). Note that
v < B and the buffer occupancy is described by the number
of independent linear combinations present in a node's buffer.
Packets have size K (a multiple of k) and are treated as vectors
of values from GF(2^k). During a contact, a node scales each
vector (coded packet) in its buffer with randomly selected
elements from GF(2^k) and adds them, thereby creating a new
network coded packet, which is sent to the other node. A node
is able to decode all variables only when it has received v 
independent linear combinations. We say that a coded packet 
received by a node is innovative if it increases the rank of 
the equation system formed by coded packets in that node's 
buffer. A contact is efficient iff at least one innovative packet 
is transferred. We analyze two protocols: one in which
relay nodes send random linear combinations of coded packets
stored in their buffer during contacts (as described above)
and the other where nodes compare their buffers¹ and only
forward to each other (coded) packets selected uniformly at
random among those not contained by the other. The two
protocols are denoted by T (true RLNC) and A (a type
of random message selection), respectively. RLNC schemes
transport along with packets the random coefficients as well
as the identities of original variables combined in the coded
packets, therefore providing a distributed solution [9], [10]. It
can be proven that the overhead of storing and transporting
these random coefficients is small. Note that A can also be
used with variables as packets (instead of coded packets),
as relays do not perform network coding, thus eliminating
coefficient overhead. T and A are similar to E-NCP, E-RP [6].
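
The coding step and the innovation test described in this section can be illustrated with a minimal Python sketch. It is restricted to GF(2) (that is, k = 1) so that a coefficient vector fits in an integer bitmask and addition becomes XOR. The function names and the bitmask encoding are assumptions made for illustration only, and packet payloads are omitted.

```python
import random


def rank_gf2(vectors):
    """Rank over GF(2) of coefficient vectors stored as int bitmasks."""
    basis = {}  # highest set bit -> basis vector with that leading bit
    for vec in vectors:
        while vec:
            lead = vec.bit_length() - 1
            if lead not in basis:
                basis[lead] = vec
                break
            vec ^= basis[lead]  # cancel the leading bit and keep reducing
    return len(basis)


def random_combination(buffer, rng):
    """Random linear combination over GF(2): XOR of a random subset."""
    combo = 0
    for vec in buffer:
        if rng.getrandbits(1):  # coefficient drawn uniformly from {0, 1}
            combo ^= vec
    return combo


def is_innovative(packet, buffer):
    """True if the packet increases the rank of the receiver's buffer."""
    return rank_gf2(buffer + [packet]) > rank_gf2(buffer)


def can_decode(buffer, v):
    """A node decodes once it holds v independent combinations."""
    return rank_gf2(buffer) == v


# Hypothetical contact with v = 4 variables (bit i stands for variable i).
rng = random.Random(1)
sender = [0b0011, 0b0100]
receiver = [0b0011]
packet = random_combination(sender, rng)
print(is_innovative(packet, receiver))
```

Over GF(2) a random combination is redundant fairly often, which is why practical schemes use larger fields; the sketch only shows the mechanics of the rank test.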

III. Main Results 

A. Random Message Selection with Feedback vs. RLNC 

The following result shows that the operation of random 
message selection with buffer feedback during contacts (A) is 
almost identical to true RLNC (T). Thus, results for A apply to
T and vice versa. The equivalence uncovered by this theorem
can be used for designing optimal distributed network coding 
protocols for intermittently-connected networks, initially under 
the more tractable A, then applied to T. A relies on nodes 
exchanging information about the list of packets in buffers, 
during contacts. Should this capability not be available, T-type 
RLNC offers the distributed counterpart. 

Theorem 3.1: Given identical mobility and initial condi- 
tions (set of packets already disseminated by the source in the 
network and which are prepared to start the epidemic network 
spread), an arbitrarily-selected contact between two nodes A 
and B at time t will have approximately the same probability 
that node A delivers a novel packet to node B, under both protocols T and A.²

1 Using counting Bloom filters.

2 [5] also implies that (T = A = global rarest).

Proof: We use the following notation: for a node w, $S_w^-$
and $S_w^+$ designate the subspace spanned by the coded packets
belonging to this node's buffer, before and after a contact with
another node v, respectively. It is thus easy to infer [3] that:

$\Pr[S_w^+ \not\subseteq S_u^- \mid S_w^- \subseteq S_u^-,\ S_v^- \not\subseteq S_u^-] \ge 1 - \frac{1}{q}$ (1)

$\Pr[\dim(S_w^+) > \dim(S_w^-) \mid S_v^- \not\subseteq S_w^-] \ge 1 - \frac{1}{q}$ (2)

where $q = |GF(2^k)|$ and u can be any other node. The two
probabilities describe the way a node w acquires new degrees
of freedom in its buffer under RLNC with T-type forwarding.
Each of these probabilities is equal to 1 for A-type forwarding,
so eqs. (1) and (2) continue to hold for A. If we
consider the case of a very large q ($q \to +\infty$), then T and
A become identical. For known mobility and known initial
packet distribution, we can construct a DTMC to capture the
packet propagation. A state contains an array of size N, and the
element of this array at index i stores the list of degrees
of freedom acquired by node i until that time step. There
are N degrees of freedom under both T and A. Consider a
contact between nodes A and B, where we analyze only the
transmission from A to B. From eqs. (1) and (2), this transmission
is successful iff node A's buffer has one degree of freedom
not available to B. In both T and A this degree of freedom is
selected uniformly at random from those available to A and not
available to B. Thus, the transition probabilities are the same
for both DTMCs and the two protocols behave identically. In
a more realistic setting when q is finite, RLNC with T will in
fact slightly underperform A, because the probabilities in eqs.
(1) and (2) will be 1 for A and $\ge 1 - \frac{1}{q}$ for T.
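
The size of the gap between the two protocols for finite q can be made concrete with a short calculation. The sketch below assumes (an assumption of this example, not a statement of the proof) that the sender's stored packets are linearly independent, so that a random combination is uniform over the sender's subspace; the probability of landing outside the receiver's subspace then follows from counting vectors. The function name and arguments are hypothetical.

```python
from fractions import Fraction


def prob_innovative(q, dim_sender, dim_common):
    """Probability that a uniform vector of the sender's span is new.

    dim_sender: dimension of the sender's subspace.
    dim_common: dimension of its intersection with the receiver's subspace.
    A uniform vector falls inside the intersection with probability
    q**dim_common / q**dim_sender.
    """
    if dim_common >= dim_sender:
        return Fraction(0)  # sender has nothing new to offer
    return 1 - Fraction(1, q ** (dim_sender - dim_common))


for k in (1, 4, 8):
    q = 2 ** k
    # Worst case: the sender holds exactly one extra degree of freedom.
    print(k, prob_innovative(q, 3, 2))
```

In the worst case the value equals 1 - 1/q, so the difference to A-type forwarding (where the probability is 1) shrinks quickly as k grows.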

To prove rigorously that the uniform selection of degrees 
of freedom (dimensions) leads to similar behavior of T and 
A, we have to postulate the following elementary theorem, 
known from linear algebra, presented here without proof: 

Every n-dimensional vector space V over some finite
field F is isomorphic to $F^n$. If $v_1, v_2, v_3, \ldots, v_n$ is a basis
of V, then the mapping $\phi : F^n \to V : (a_1, \ldots, a_n) \mapsto
\sum_{k=1}^{n} a_k v_k$ is an isomorphism.

Observation: Since the choice of basis for V is not unique
(there are many possibilities), the above isomorphism is also
not unique. In fact, we can construct many such isomorphisms.

Final steps: We need this isomorphism simply because 
tracking the evolution of vector spaces (that is, node buffers) 
during the packet spread process is very challenging. Such 
isomorphisms offer an easy way to label buffers in a consistent 
manner. In particular, we are interested in mapping each buffer
to a subset of the base $\{p_1, \ldots, p_v\}$, where $p_1, \ldots, p_v$ are
the initial packets at the source. We regard each buffer as a
subspace/subset of the v-dimensional vector space. Each such
buffer/subspace is generated by the vectors/packets present in
it. Note that the labelling will be performed for every node
and will hold at every step of the packet spread. However, a
final point needs to be discussed. One has to observe that we
cannot simply map all k-dimensional subspaces to the same
set of k vectors of the base (actually, to the subspace that
they generate). This is simply because then all buffers will 
look identical after applying the isomorphism. Based on the 
fact that every intersection of subspaces is also a subspace, 
we can build the mappings/labellings for each node in a way 
that can prevent this problem. To this end, we specify a 



hard constraint requiring that the intersection of subspaces
be respected even after applying the isomorphism. This can
be translated as follows: the intersection of any number of
subspaces (buffers) has to be a subspace of the same dimension
in the original version and after applying the isomorphism.
This is effectively the final step of our proof. The attentive
reader will have already noticed that instead of working with
coded packets, we have mapped our buffers to sets of original
packets, thus effectively equating T to A. ■

B. Finding Optimal Spread Ratios 

We analyze how the number of coded packet copies
influences the instant throughput of broadcast under A-type
forwarding and extend the result to T forwarding, uni- and
multicast. For our mobility model, if coded packets have each
the same number of copies in the network at the beginning
of the forwarding, then, for an arbitrary k, Pr[node $\ell$ has
a copy of coded packet $p_k$] $= c_k =$ ct., $(\forall) \ell$. $m_{p_i}(t)$ is
the number of copies of $p_i$ contained by the network at time
t (not counting the source), and $\rho_{p_i}(t)$ the
corresponding instant density. We seek to find the relation
between $\rho_{p_i}(t)$, $i = 1, \ldots, v$, that maximizes the instantaneous
throughput. For this, we analyze the efficiency of each node's
first contact after instant t, arbitrarily chosen. For tractability,
we first look at the case when each relay node contains exactly
one coded packet at time t and generalize afterwards. In this
case, $\sum_{i=1}^{v} m_{p_i}(t) = N - 1$ and $\sum_{i=1}^{v} \rho_{p_i}(t) = 1$. We exclude w.l.o.g.
the source as its contacts are efficient by definition anyway. If
$A_{p_i}^{(t)}$ is the set of nodes (without the source) containing a copy
of coded packet $p_i$ at time t, then $|A_{p_i}^{(t)}| = m_{p_i}(t)$, $(\forall) i = 1, \ldots, v$, so that
$\bigcup_{i=1}^{v} A_{p_i}^{(t)} = \mathcal{N} - \{s\}$ and $A_{p_i}^{(t)} \cap A_{p_j}^{(t)} = \emptyset$, $(\forall) i \ne j$, where $\mathcal{N} - \{s\}$
is the set of all nodes, without the source. For an arbitrary
$\ell \in A_{p_i}^{(t)}$, Pr[next contact of $\ell$ is inefficient] $= \rho_{p_i}(t) - \frac{1}{N-1}$
(efficient with probability $1 - \rho_{p_i}(t) + \frac{1}{N-1}$), meaning that
$\ell$ has met another node from the same set, no data transfer
occurred and the waiting time preceding the contact had been
wasted.

We are interested in maximizing throughput (maximizing
the expected number of efficient first contacts of each node
after instant t). Therefore, under $\sum_{j=1}^{v} \rho_{p_j}(t) = 1$ we maximize³

$f(\rho_{p_1}(t), \ldots, \rho_{p_v}(t)) = \sum_{k=1}^{v} \sum_{\ell \in A_{p_k}^{(t)}} (1 - \rho_{p_k}(t)) = \sum_{k=1}^{v} m_{p_k}(t) (1 - \rho_{p_k}(t)) =$ (3)

$= (N-1) \cdot \sum_{k=1}^{v} \rho_{p_k}(t) (1 - \rho_{p_k}(t))$ (4)

Using Lagrange multipliers,

$L(\rho_{p_1}(t), \ldots, \rho_{p_v}(t), \lambda_{sol}) = f(\rho_{p_1}(t), \ldots, \rho_{p_v}(t)) + \lambda_{sol} \left( \sum_{k=1}^{v} \rho_{p_k}(t) - 1 \right)$ (5)

$\frac{\partial L}{\partial \rho_{p_k}} = (N-1) \cdot (1 - 2\rho_{p_k}(t)) + \lambda_{sol} = 0$ (6)

$\Rightarrow \lambda_{sol} = \frac{2(N-1)}{v} - (N-1)$ (7)


3 Considers bidirectional contacts. The approximation that node densities do
not change significantly between two contacts is verified in Section III-C. The
constant $\frac{1}{N-1}$ does not influence the result.



Replacing $\lambda_{sol}$, we find that $\rho_{p_1}(t) = \ldots = \rho_{p_v}(t) =
\rho(t) = \frac{1}{v}$. Thus, all densities should be equal at instant t. The
generalization for the case of more (or less) than 1 packet/node
on average ($\sum_{i=1}^{v} \rho_{p_i}(t) = c$, $c > 0$) is provided by


Theorem 3.2: The following condition is necessary for
maximizing throughput of a batch of v coded packets in a DTN
with A-type forwarding: $\rho_{p_1}(t) = \rho_{p_2}(t) = \rho_{p_3}(t) = \ldots =
\rho_{p_v}(t) = \rho(t)$, $(\forall) t$. In other words, regardless of the buffer
occupancy level, packet densities should be roughly equal to
ensure maximal throughput.

Proof: For each node $\ell$ define the concept of entire
buffer packet as being the indicator function $1_\ell :
\{p_1, p_2, p_3, \ldots, p_v\} \to \{0, 1\}$, $1_\ell(p_k) = 1 \iff$ the buffer of
$\ell$ contains coded packet $p_k$. A contact between $\ell, u \in \mathcal{N}$ is
efficient $\iff 1_\ell \ne 1_u$. Using the above argument for entire
buffer packets we check that $\sum_{\ell \in \mathcal{N} - \{s\}} \rho_{1_\ell}(t) = 1$ and therefore
$\rho_{1_\ell}(t) = \rho_{buf}(t) =$ ct., $(\forall) \ell \in \mathcal{N} - \{s\}$ is necessary for
throughput maximization. From this set of equalities at time t,
considering an arbitrarily chosen but fixed entire buffer packet
$1_u$ $\Rightarrow$ $\Pr[\ell \in A_{1_u}^{(t)}] = \Pr[1_\ell(p_1) = 1_u(p_1)] \cdot \Pr[1_\ell(p_2) =
1_u(p_2)] \cdot \Pr[1_\ell(p_3) = 1_u(p_3)] \cdot \ldots \cdot \Pr[1_\ell(p_v) = 1_u(p_v)] =
\rho_{buf}(t)$.⁴ But since none of the buffers
is yet full ($\max_{\ell \in \mathcal{N} - \{s\}} |1_\ell| = k < N$, $|1_\ell| = K_\ell$ = $\ell$'s buffer
occupancy), $\Pr[1_\ell(p_{i_1}) = 1] \cdot \Pr[1_\ell(p_{i_2}) = 1] \cdot \Pr[1_\ell(p_{i_3}) =
1] \cdot \ldots \cdot \Pr[1_\ell(p_{i_{K_\ell}}) = 1] = c_\ell =$ ct., $i_1, i_2, i_3, \ldots, i_{K_\ell} =
1, \ldots, v$, $(\forall) \ell \in \mathcal{N} - \{s\}$. This is an equation system with N - 1
product equations and v < B < N unknowns, where each
probability is known to be strictly positive. By equalizing
equations with the same number of factors, we obtain that
with high probability $\Pr[1_\ell(p_1) = 1] = \Pr[1_\ell(p_2) = 1] =
\Pr[1_\ell(p_3) = 1] = \ldots = \Pr[1_\ell(p_v) = 1] =$ ct., $(\forall) \ell \in
\mathcal{N} - \{s\}$, which means that coded packets should have equal
spreads. ■
Remark: The same condition is necessary for providing 
optimal throughput also for unicast and multicast. This is 
because each node will deliver a packet to the target(s) with 
equal probability. 
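
The equal-density condition can be checked numerically for the one-packet-per-node case. The following minimal Python sketch evaluates the objective of eq. (4), (N - 1) times the sum of ρ(1 - ρ) over the batch, at the uniform point and at randomly drawn density vectors that sum to 1, and verifies the stationarity condition of eq. (6) with the multiplier of eq. (7). Function names and the sampling scheme are illustrative assumptions.

```python
import random


def expected_efficient(rho, n_nodes):
    """Objective of eq. (4): (N - 1) * sum_k rho_k * (1 - rho_k)."""
    return (n_nodes - 1) * sum(r * (1.0 - r) for r in rho)


def lagrange_multiplier(v, n_nodes):
    """Multiplier of eq. (7): 2(N - 1)/v - (N - 1)."""
    return 2.0 * (n_nodes - 1) / v - (n_nodes - 1)


def random_density_vector(v, rng):
    """Random non-negative densities that sum to 1 (illustrative sampler)."""
    weights = [rng.random() + 1e-12 for _ in range(v)]
    total = sum(weights)
    return [w / total for w in weights]


n_nodes, v = 100, 10
uniform = [1.0 / v] * v
best = expected_efficient(uniform, n_nodes)

rng = random.Random(0)
worst_gap = min(
    best - expected_efficient(random_density_vector(v, rng), n_nodes)
    for _ in range(1000)
)
print(best, worst_gap)  # the gap stays non-negative: uniform is never beaten

# Stationarity of eq. (6) at rho_k = 1/v.
residual = (n_nodes - 1) * (1 - 2.0 / v) + lagrange_multiplier(v, n_nodes)
print(abs(residual) < 1e-9)
```

Since the sum of ρ(1 - ρ) equals 1 minus the sum of squared densities, any deviation from the uniform vector lowers the objective, which is what the random comparison illustrates.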

C. Impact of Packet Counts on Contact Efficiency 

In this subsection we explain why the assumption of equal
packet spread cannot be taken for granted and demonstrate
its performance impact. We define the entropy of relative
(normalized) coded packet densities at time instant t
as $H(\rho'(t)) = \sum_{k=1}^{v} \rho'_{p_k}(t) \cdot \log_v \frac{1}{\rho'_{p_k}(t)}$,⁵ where $\rho'(t) =
(\rho'_{p_1}(t), \rho'_{p_2}(t), \rho'_{p_3}(t), \ldots, \rho'_{p_v}(t))$ and $\rho'_{p_k}(t) \in [0, 1]$ are the
normalized counterparts of $\rho_{p_k}(t)$, $(\forall) k = 1, \ldots, v$. The entropy
is close to 1 iff network coded packets have similar instant
densities in the network. The entropy allows us to quantify 
the discrepancy between densities through a scalar. We analyze 
the evolution with time of H(p'(t)) and use as an example a 

4 The independence assumption is reasonable, given that a coded packet is 
selected by a node randomly from its buffer for transmission during a contact 
(only from those packets that are innovative). 

5 We consider by convention that $0 \cdot \log(+\infty) = 0$.
(a) Protocol 1 (continuous source transmission) vs. Protocol 2 (equal densities).
(b) Protocol 1 (continuous source transmission) vs. Protocol 3 (equal densities).

Fig. 1. Comparison of instantaneous entropies of normalized network coded
packet densities for a network of 100 nodes.

network with N = 100 nodes, B = 11, λ = 0.005. In
Fig. 1(a) and 1(b) we show how the entropy evolves over time
using three representative mobility realizations that run until
all destinations decode the data, for the following protocols:

1. A benchmark A-type protocol: the source transmits a 
batch of 11 coded packets continuously until all desti- 
nation nodes have decoded the data. No specific measure 
is taken to maintain equal densities; 

2. Another A-type protocol with a batch of 10 coded 
packets, each placed in a separate node before the spread 
is triggered. The source continues transmitting coded 
packets to non-full nodes; 

3. Similar to 2., with the exception that after distributing the
initial copies and triggering their spread, the source stops
disseminating data.
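
A minimal Python sketch of A-type forwarding in the spirit of protocol 3 is given below. It relies on simplifying assumptions that are not part of the original evaluation: the source is not modeled after seeding, every pair of relays is equally likely to be the next one to meet (a consequence of i.i.d. exponential inter-contact times), and one packet is sent per direction per contact. The function name, parameters and the small network size are hypothetical.

```python
import random


def simulate_a_type(n_nodes, v, buffer_size, initial, rng, max_contacts=10**6):
    """Spread v packets among n_nodes relays with A-type forwarding.

    initial maps a node index to the set of packet ids it holds at time 0.
    Returns (contacts used, efficient transfers), or (None, efficient
    transfers) if max_contacts is reached before everyone holds v packets.
    """
    buffers = [set(initial.get(i, ())) for i in range(n_nodes)]
    efficient = 0
    for contact in range(1, max_contacts + 1):
        a, b = rng.sample(range(n_nodes), 2)  # uniformly random pair meets
        before = {a: set(buffers[a]), b: set(buffers[b])}  # pre-contact view
        for sender, receiver in ((a, b), (b, a)):
            # Buffer comparison: only packets the peer lacks are candidates.
            missing = sorted(before[sender] - before[receiver])
            if missing and len(buffers[receiver]) < buffer_size:
                buffers[receiver].add(rng.choice(missing))
                efficient += 1
        if all(len(buf) == v for buf in buffers):
            return contact, efficient
    return None, efficient


rng = random.Random(7)
seeded = {k: {k} for k in range(5)}  # equal spread: packet k sits on relay k
contacts, transfers = simulate_a_type(20, 5, 6, seeded, rng)
print(contacts, transfers)
```

Changing `seeded` to an unequal initial placement (for example, most relays holding the same packet) gives a quick way to observe how the initial spread affects the number of contacts needed.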

Remarks: There are a number of observations, which hold
in general (also for T). Firstly, high entropies are conserved
by exponential inter-contact times. Secondly, the delays of
protocols 2. and 3. are always almost identical, meaning that
further source intervention no longer improves the throughput
and that high entropy should be sufficient for
maximizing throughput. Thirdly, when the source injects packets
in a greedy manner (a strategy commonly considered to yield
minimal delay [6]), the entropy drops significantly, impacting

6 The same behavior is observed for other combinations of parameters. 
Timeline scaled with N — 2 in plots. 

7 Fig. 1(a) and 1(b) show time-evolving entropies for protocols 2. and 3. (equal
densities) without the time needed by the source to place a copy of each
coded packet in disjoint nodes. This will be studied in Section IV.



overall contact efficiency. 
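
The entropy measure used in this subsection is straightforward to compute from the per-packet copy counts. The Python sketch below assumes a logarithm in base v, so that the maximum value is 1 (consistent with the statement that the entropy is close to 1 for similar densities), and applies the convention that packets with zero copies contribute nothing. The function name and the example counts are hypothetical.

```python
import math


def normalized_entropy(copies):
    """Entropy of normalized packet densities, with logarithm in base v.

    copies[k] is the number of copies of coded packet k in the network.
    Packets with zero copies contribute 0 by convention.
    """
    v = len(copies)
    total = sum(copies)
    if v < 2 or total <= 0:
        raise ValueError("need at least two packets and one copy")
    entropy = 0.0
    for m in copies:
        if m > 0:
            p = m / total  # normalized density of this packet
            entropy += p * math.log(1.0 / p, v)
    return entropy


print(normalized_entropy([5, 5, 5, 5]))    # equal spread
print(normalized_entropy([17, 1, 1, 1]))   # one dominant packet
print(normalized_entropy([20, 0, 0, 0]))   # a single packet in the network
```

Equal counts give the maximum value, while a greedy injection pattern that lets early packets dominate pushes the value toward 0.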

IV. Improved Forwarding Protocol 

We define the seeding phase of a transmission as the time
interval used by the source to place v independent coded packets
each on a distinct relay node, for a batch of v variables. The
time needed for this operation is a random variable $T_s^n$, where
$n \in \mathbb{N}$ is the identifier of the packet batch (or $T_s$ when
we do not refer to a specific n). Similarly, we define the
propagation phase of the transmission as the interval in which
the independent coded packets are forwarded epidemically in
the network; this step finishes when all destination nodes have
successfully decoded the packets. The time needed for the
propagation phase of batch n is another random variable, $T_p^n$
(identically distributed as $T_p$). For every packet batch, the
propagation phase takes place immediately after the seeding
phase. The key idea is that the seeding phase of packet batch
n + 1 can be performed in parallel with the propagation phase
of packet batch n. To accomplish this, we need to ensure
that v = B - O(C), s.t. B - O(C) buffer slots are available to
the propagation of batch n and O(C) are reserved for the seeding of
batch n + 1. For each node, these B - v places will host
packets copied directly from the source.

Remark: The throughput loss caused by the fact that v ≠ B
is shown to be negligible in comparison to the gain resulting
from pipelining (see Section V). In practice B - v ∈ {1, 2}.

Theorem 4.1: The seeding phase can be completed in Θ(v)
steps (in practice, approximately v consecutive contacts of the 
source) with high probability. At the end of the seeding phase, 
each of the v independent coded packets will be placed on a 
different relay node with high probability. 

Proof: The seeding algorithm performed by the source is
described in the following. From the original v variables, the
source constructs v independent coded packets with RLNC.
Each of these v coded packets is sent by the source only once.
During the j-th contact, the j-th coded packet $p_j$ is sent by the source
to the peer node, $j = 1, \ldots, v$. To ensure that all packets start
spreading at roughly the same time (during the propagation phase),
the source specifies that coded packet $p_j$ should be forwarded
only after the estimated time to finish the seeding. In the most
favorable case, the source encounters a different node every
time. This happens for $B \ll N$ ($v < N$). In this case we can
set v = B - 1. Let $P_i$ be the probability that packet $p_i$ will be
successfully placed in a node not already containing a packet
from the same batch. Then, $P_i = \frac{N-i}{N-1}$ and $E[X] = \sum_{i=1}^{v} \frac{1}{P_i}$,
where X is a r.v. (the number of steps to perform seeding).
Thus $E[X] \approx v$ for $B \ll N$. The source faces a variant of
the coupon collector problem for higher B. For this case, we
can set v = B - 2, and the probability that two coded packets
of the same batch end up in the same node during seeding is
much higher. However, the relay will move the extra coded
packet to another node (not already containing a coded packet
from the same batch) at the first opportunity, which occurs
with high probability. Therefore, the probability that $p_i$ is not
placed successfully for this case (neither by the source, nor
by the relay at some point in the future, before the end of the
seeding phase, which should occur after the v-th contact of the
source) is $\pi_i \le (1 - P_i) \cdot \prod_{k=i+1}^{v} (1 - P_k)$.

In practice, we work with networks of limited buffers, where
$v < B \ll N$ and therefore $\pi_i \approx 0$. This probability is very
low even when v = \^] . In conclusion, seeding can be done 
on average in v steps successfully. ■ 
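
The claim that seeding takes about v contacts when v is much smaller than N can be checked with a few lines of Python. The sketch assumes one natural form of the placement probability under the network model, namely that when the i-th packet is placed, i - 1 of the N - 1 relays already hold a packet of the batch, so the probability of meeting a fresh relay is (N - i)/(N - 1). This closed form and the function names are assumptions of the example.

```python
from fractions import Fraction


def placement_probability(i, n_nodes):
    """Assumed probability that packet i lands on a relay holding none."""
    return Fraction(n_nodes - i, n_nodes - 1)


def expected_seeding_contacts(v, n_nodes):
    """Sum over the batch of the expected contacts per placement, 1/P_i."""
    return sum(1 / placement_probability(i, n_nodes) for i in range(1, v + 1))


for v in (5, 10, 20):
    steps = expected_seeding_contacts(v, 100)
    print(v, float(steps))
```

For N = 100 and v = 10 the expectation is roughly 10.5 contacts, which is close to v; the overhead grows as v approaches N, in line with the coupon collector behavior mentioned in the proof.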
Splitting in two phases (seeding and propagation) is suggested
by the resemblance to channel coding: to approach the capacity of the channel
(which is analogous to the DTN), a block of v bits is
assembled, coded, sent and then decoded by the destination.
As $v \to +\infty$, the capacity can be approached asymptotically.

Corollary 4.1: The seeding phase occurs with the minimum 
possible energy consumption for the source. 

No feedback is assumed and reliable packet delivery is
required even when the source is backlogged. We therefore
enforce $T_p^l$ as deadline for the propagation phase and aim
to achieve full delivery with high probability, before this
deadline is reached. The time spent in the propagation phase
is measured from the v-th contact of the source (the one that
delivered the last packet of the batch to the network). The
probability that the propagation phase will be longer than $T_p^l$
can be bounded using one of the following:

• $\Pr[T_p > T_p^l] \le \frac{E[T_p]}{T_p^l} = \epsilon_p$ (Markov's inequality);

• $\Pr[T_p > T_p^l] \le \inf_{s > 0} e^{-s T_p^l} \cdot M_{T_p}(s) = \epsilon_p$, where $M_X(s)$
is the mgf of variable X (Chernoff's inequality, which is
the tightest, if applicable).
The derivation of the CCDF of $T_p$ is omitted here due to limited
space and is provided in [11]. Moreover, due to the fact
that packet densities are almost equal at the beginning of
the propagation phase, the assumptions made in [6] are now
accurate, allowing easier analytical treatment. The probability
that there is at least one destination that has not decoded all
data is $\epsilon_p \to 0$, for $T_p^l$ reasonably large. In Section V we
show that this condition can be achieved already for low $T_p^l$
and therefore throughput is not affected.
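
The two deadline bounds can be evaluated numerically once samples of the propagation time are available. The Python sketch below uses the empirical mean for Markov's inequality and an empirical moment generating function, minimized over a grid of s values, for Chernoff's inequality. The sample distribution (a sum of exponential stages), the deadline and the grid are purely hypothetical choices; they do not reproduce the omitted CCDF derivation.

```python
import math
import random


def markov_bound(samples, deadline):
    """Markov: Pr[T > deadline] <= E[T] / deadline (capped at 1)."""
    mean = sum(samples) / len(samples)
    return min(1.0, mean / deadline)


def chernoff_bound(samples, deadline, s_grid):
    """Chernoff: min over s > 0 of exp(-s * deadline) * M_T(s)."""
    best = 1.0
    for s in s_grid:
        mgf = sum(math.exp(s * x) for x in samples) / len(samples)
        best = min(best, math.exp(-s * deadline) * mgf)
    return best


def empirical_tail(samples, deadline):
    """Fraction of samples that exceed the deadline."""
    return sum(1 for x in samples if x > deadline) / len(samples)


rng = random.Random(3)
# Hypothetical propagation times: 20 exponential stages per batch.
samples = [sum(rng.expovariate(1.0) for _ in range(20)) for _ in range(2000)]
deadline = 35.0
grid = [0.01 * i for i in range(1, 100)]

print(empirical_tail(samples, deadline))
print(markov_bound(samples, deadline))
print(chernoff_bound(samples, deadline, grid))
```

Both quantities upper-bound the empirical tail; for light-tailed propagation times and a deadline well above the mean, the Chernoff value is typically much smaller than the Markov value.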

V. Simulation Results 

We test the pipelined-T protocol (with RLNC at intermediary
nodes) against the simple T forwarding protocol, which
also uses coding at intermediary nodes and which should be
throughput optimal. Fig. 2(a) shows the additional throughput
provided by the pipelined protocol. To ensure full delivery,
we let $T_p^l = \max_{n \in \{1, \ldots, 100\}} \{T_p^n\}$. Surprisingly, pipelining
is quite close to achieving the throughput capacity (no more
than one packet, coded or not, can be sent by the source
during a contact). Fig. 2(b) shows an extra delay incurred
by packets due to the seeding phase exceeding the length of
the propagation phase; its impact is however minimal. Using
smaller buffers, pipelining can achieve throughputs superior
to usual RLNC schemes (which need more memory), at the
cost of a small additional delay. The pipelining protocol can
also be used with non-coded packets. This is necessary when
destination nodes only require some of the packets to be
delivered, and do not need to decode the entire packet batch.
In this case the overhead associated with transmission of
coefficients and computations over the finite field is eliminated,
but the observations from Fig. 2(a) and 2(b) remain valid. The
attempt made in [6] to use equalizing spray counts does not
(a) Throughputs (λ = 0.005, N = 100)

Fig. 2. Performance evaluation of the pipelining protocol (averaged over 100
source-generated packet batches)

obtain better delays simply because it still allows a long initial 
low entropy interval, which has a snowball effect. Further 
increasing the throughput by setting v > B is not possible, 
because in broadcast every node must be able to decode the 
transmission. 

Our conclusions would seem to contradict the results of
Ahlswede et al. [12] and Deb et al. [3]; this is however not
the case, because we in fact complement the two papers for
the case of intermittently connected networks. Firstly, RLNC
would reach the maxflow-mincut bound when $v \to +\infty$,
which would mean very large buffers (not possible in our
case). Secondly, [3] assumes equal initial packet spread, an
assumption which does not hold in DTNs with usual forwarding
protocols.

VI. Conclusions 

In this paper we consider the problem of optimizing 
throughput in intermittently connected networks, with minimal 
impact on delay. We specifically address the practical case 
of limited buffers. It is proven that network coding underutilizes
the available resources. Following information theoretical
hints, we design a practical forwarding protocol relying on 
pipelining, which achieves asymptotically reliable delivery and 
outperforms network coding in throughput, energy consump- 
tion and memory usage with negligible delay overhead. DTNs 
are shown to be very sensitive to initial forwarding conditions 
(in particular, initial number of packet copies). Setting them 
to convenient values is easily achieved and yields significant 
performance gains. On the other hand, trying to control the 
network after the initiation of the forwarding process is much 
more challenging. Our analysis of the single source broadcast 



generalizes to uni- and multicast. A thorough consideration 
of energy constraints, congestion, multiple unsynchronized 
sources, comparison with other coding techniques, improved 
pipelining and applicability of the maximum-entropy principle 
to other mobility models is left for future work. 
Appendix

A. Remarks Regarding the Equivalence Between A and T 

To see why the equivalence holds, we can consider the effect 
of the initial packet density distribution on both schemes. Let 
us assume that out of the v coded packets, the source has
managed to send to the network only ψ < v. These packets
have the normalized density distributions $\rho'_1, \rho'_2, \rho'_3, \ldots, \rho'_v$,
meaning that $\rho'_j = 0$, $(\forall) j = \psi + 1, \ldots, v$. Clearly, due to the
uniformity of the mobility model, the probability distribution
for having already received these packets is the same across
nodes. Let us assume w.l.o.g. that no coded packets from the
source have been yet coded together. The lower the entropy,
the higher is the chance that the distribution is very skewed.
This means that some packets have already achieved high 
spread, while most others have only a few copies in the 
network. Then, the chance that two nodes in the next contact 
have exactly the same buffer content is very high. In this 
case, both T and A generate the same inefficient contacts.
Even if their buffers are not exactly the same, the overlap 
will be anyway significant. The contact will deliver with a 



high probability an independent packet to destination, but the 
problem is that most contacts in the network generate new 
random vectors from the same very few linear subspaces of 
similar dimension. For this reason, a node having received a 
network coded packet is still very likely to deliver during its 
next contact a vector which is already in the linear subspace 
of the receiving node. The essential observation to be made 
is that T does not promote packets of lower densities better
than A. A rare packet will be coded together with others 
at basically the same rate as the one at which A promotes 
it. Indeed, the nodes receiving a rare packet will be able 
to deliver new combinations to others (under T), since the 
combinations contain the new packet. But this happens exactly 
the same under A too, anyway. As the buffers will be almost 
identical, the nodes having the rare packets will have to send 
them anyway, just like in T, because almost all the others 
that they have are already present in the nodes they meet. 
In other words, nodes receive new degrees of freedom at the 
same rate, both under A and T. What matters is not only that a new
independent vector has been received, but also the way it was
obtained. If most nodes receive independent vectors generated
from the same few bases, then in the next step they will for 
sure deliver redundant packets. As the source disseminates 
the initial base {pi,p2,P3, . . . ,p v } in the network during its 
contacts, it matters which packets of the initial base have 
reached destination nodes, and not the way these packets have 
been combined by the network. The assumption we made 
above that we first regard a network which has not coded yet 
packets together is indeed without loss of generality precisely 
for this reason. These simple facts provide us with the result 
that the behavior of both A and T is almost identical. A can 
therefore be used as a very good approximation for T, where 
this is necessary for tractability reasons. 



B. Impact of Entropy on Contact Efficiency 

Fig. 3(a)-3(c) show contact efficiency for the same three
mobility traces used in Fig. 1(a) and 1(b). It can be clearly seen
that high entropies allow the number of efficient contacts
per unit of time to increase very fast and to remain at high
levels, therefore improving throughputs, as opposed to the low
entropy case. Low entropy will always generate far fewer
efficient contacts, with negative effects as both directions of the
bidirectional links established during contacts are affected.
Fig. 3. Average number of efficient contacts per unit of time (over a sliding
window of 50 contacts), for protocols 1. (black) and 3. (blue). 



Bidirectional contacts where both nodes have novel information for the 
other are counted as two efficient contacts. 



